Course: Visual Analytics for Policy and Management

Final Project

Group 3: Dave Coomes, Quinhas Fernndes, Isabella Sun, Long Zong

This tutorial includes 8 figures:

  1. Univariate: Mozambique Community Health Workers contribuition on Family Planning

  2. Univariate: Province level gap to achieve under five mortality reduction goals

  3. Bivariate: Under Five Mortality Distribution by Region in Mozamique

  4. Multivariate: Radar plot- Main study variables by province

  5. Multivariate: Factors that may influence the mortality rate in Mozambique over time

  6. Multivariate: Regression Confidence Intervals - Effect of Health Human resources and Service Utilization on Child Mortality

  7. Maps: Under 5 mortality rate by province in 2010

  8. Maps: Change in under 5 mortality rate from 2000-2010 by province

For this final project, we used data from two sources. The first is the Under Five Mortality Dataset. The second is the Community Health Workers dataset.

Under Five Mortality dataset: Provincial-level under-5, infant, and neonatal mortality from 2000 to 2010 in Mozambique was estimated using data from the 2003 and 2011 DHS and the 2008 Multiple Indicator Cluster Survey (MICS). These three datasets were merged and used to calculate the provincial-level probability of a child dying for each year of the 11-year period using direct life-table estimation methods.

Community Health Workers dataset: To understand the extent to which the program implementation was achieving the expected results we used data from two main sources (aggregated in one dataset), the Health Information System and the population census. The HIS system provided the number of CHW by province and new users of family planning. We abstracted from the census the number of women in reproductive age.

You can access the github repository HERE In the repository, you can download the R markdown file used to create this page as well as the two main datasets used, and shapefiles for mapping.


Beginning the code to produce data visualizations


Before we begin to create the visualizations, we will need to install and load the necessary libraries.

knitr::opts_chunk$set(echo = TRUE, warnings=FALSE, message=FALSE)

if (!require(ggrepel)) install.packages("ggrepel", repos = "http://cran.us.r-project.org", dependencies = TRUE)
if (!require(ggiraph)) install.packages("ggiraph", repos = "http://cran.us.r-project.org", dependencies = TRUE)
if (!require(ggiraphExtra)) install.packages("ggiraphExtra", repos = "http://cran.us.r-project.org", dependencies = TRUE)
if (!require(dplyr)) install.packages("dplyr", repos = "http://cran.us.r-project.org", dependencies = TRUE)
if (!require(dotwhisker)) install.packages("dotwhisker", repos = "http://cran.us.r-project.org", dependencies = TRUE)
if (!require(openxlsx)) install.packages("openxlsx", repos = "http://cran.us.r-project.org", dependencies = TRUE)
library(ggplot2)
library(foreign)
library(haven)
library(uwIntroStats)
library(dplyr)
library(readr)
library(ggiraph)
library(ggiraphExtra)
library(foreign)
library(dotwhisker)
library(broom)
library(dplyr)
library(openxlsx)


Univariate data visualizations


Univariate #1: Lolliplot

Community health worker contribution to family planning


First, we bring in the dataset we will use.

link="https://github.com/quinhasf/pubpol-599/raw/master/ape_analysis.dta"
chw_fp <- read_dta(url(link))

We will be using the ggplot package to create the visualization. Ggplot can only read the data in a dataframe (as opposed to a table/vector).

#First we only keep the relevant variables and put it in a new data frame
tableFreq <- data.frame(chw_fp[c('province', 'ape_contrib')])
#Give the data frame columns a name
names(tableFreq)=c("province","ape_contrib")

We want our visualization to be ordered, so we create a data frame that is ordered.

tableFreqO=tableFreq[order(tableFreq$ape_contrib),]

The target contribution for community health workers to family planning is 10. So, we add new variables to the data frame indicating whether each province is above the target or below the target.

tableFreqO$gap=tableFreqO$ape_contrib-10
tableFreqO$Target=ifelse(tableFreqO$gap>0,"Above Target","Below Target")

We want to define various plot elements such as the title, subtitle, caption, and source.

loli_title = "Mozambique Community Health Workers contribuition on Family Planning"
loli_subtitle = "2017 province level gap on CHW contribuition"
loli_caption = "Fig.1: Represents the contirbution of each province to achieve the 10% target (centered at 0)
in 2017 (Gap analysis). Provinces are ploted from low to high perfomance.
Source:Health Information System"

Finally, we are ready to create the lollipop plot!

First a basic plot.

base = ggplot(tableFreqO, aes(province,gap,color=Target,
                              label = round(gap,1))) 
lolliplot1=base + geom_segment(aes(y =0, 
                                   x = province, 
                                   yend = gap, 
                                   xend = province)) 
lolliplot2=lolliplot1 + geom_point()
lolliplot3= lolliplot2 + scale_x_discrete(limits=tableFreqO$province) #key element 

Then we want to properly label the axis, show specific numbers, and generally and make it readable. We also want to include the various plot elements we defined before such as title, caption, source etc.

lollitplot4 = lolliplot3 + geom_text(size=3, nudge_x=0.35, nudge_y=0.1,show.legend = FALSE) +
              labs(title = loli_title,
                   subtitle = loli_subtitle,
                    x ="Province", 
                    y = "% points of the FP GAP",
                    caption = loli_caption) +
            theme(panel.background = element_rect(fill = "gray98",
                                                    colour = "black"),
                    plot.caption = element_text(hjust = 0), 
                    plot.title = element_text(hjust = 0.5),
                    plot.subtitle = element_text(hjust=0.5),
                    legend.box.just = c("right","center"), 
                    axis.text.x = element_text(size=7, angle = 60, vjust = 1, hjust=1)) +
              geom_hline(yintercept=0,
                    linetype="dashed",
                    color = "black",
                    size= 0.9,
                    alpha= 0.8)

lollitplot4 


Univariate #2: Lolliplot

Gap to reach under five mortality goal by region


Let’s do it again, but with a new dataset. This dataset is about child mortality rates in Mozambique. Include the link and tell R to read it.

link="https://github.com/ihsun-uw/Group3_Final_Project/raw/master/child_mortality.dta"

df <- read_dta(url(link))

This dataset includes information across different years. For this plot, we are only looking at 2010. So, as before, we want to create a dataframe with relevant information. This relevant information includes whether or not each province achieved its target for child mortality rates.

df_2010 <- subset(df, year==2010)
df_2010=df_2010[order(df_2010$U5M_COMB),]
df_2010$gap <- df_2010$U5M_COMB - 75
df_2010$Target=ifelse(df_2010$gap>0,"Not Achieved","Achieved")

Let’s define some other plot elements such as title, source etc.

loli_title2 = "Mozambique 2010 Gap analysis to achieve MDG 4 Goal"
loli_subtitle2 = "Province level gap to achieve under five mortality reduction goal"
loli_caption2 = "Fig.2: Represents province level gap to achieve 2/3 reduction on under five mortality rate (75 per 1000 live births target centered at 0). \nProvinces below zero have met the 2010 target while those above zero have not.
      
Source: Demographic Health Surveys"

Now let’s create our lollipop plot! This time achieving the target is below the reference line in the plot whereas in our first lolipop plot, provinces achieving the target were above the reference line. Color can be helpful to indicate what is positive or negative. In both cases, red indicates that a province is below the target rate.

base = ggplot(df_2010, 
              aes(x=reorder(provname, gap), 
                  y=gap, 
                  color=Target,
                              label = round(gap,1))) 

lolliplot1=base + geom_segment(aes(y =0, 
                                   x = provname, 
                                   yend = gap, 
                                   xend = provname)) 

lolliplot2=lolliplot1 + geom_point()

lolliplot3= lolliplot2 + scale_x_discrete(limits=df_2010$provname) 

lollitplot4 = lolliplot3 + geom_text(nudge_x=0.35, nudge_y=0.1,show.legend = F, size=3) +
  labs(title = loli_title2,
       subtitle = loli_subtitle2,
       x ="Province", 
       y = "U5 mortality rate gap (per 1000 LB)",
       caption = loli_caption2) +
  theme(panel.background = element_rect(fill = "grey96",
                                        colour = "grey50"),
        plot.caption = element_text(hjust = 0, size = 8), 
        plot.title = element_text(hjust = 0.5, size=12, face="bold"),
        plot.subtitle = element_text(hjust=0.5, face="bold"),
        legend.box.just = c("right","center"), 
        axis.text.y = element_text(size=7),
        axis.text.x = element_text(size=7, angle = 45, vjust = 1, hjust=1),
        axis.title.x = element_text(size=10),
        axis.title.y = element_text(size=8)) +
  geom_hline(yintercept=0,
             linetype="dashed",
             color = "black",
             size= 0.9,
             alpha= 0.8)

lollitplot4 


Bivariate data visualizations


Bivariate #1: Box plot

Under five mortality rate by region


Let’s create a few box plots to show the distribution of under five mortality in the difference regions in Mozambique.

So we don’t mess with the original data, let’s create a new dataframe.

child_mort=df

As always, lets define a few misc. plot elements

box_title = "Under Five Mortality Distribuition by Region in Mozambique"
box_caption = "Fig.3 Represents the distribution of under five mortality rates (per 1000 LB) in different regions by province.
      
       
       Source: Demographic Health Surveys"

Now we can plot. Note that we wanted the plot to be ordered with provinces that had the highest child mortality rates on the left and in decreasing order to the right.

base = ggplot(df, aes(x=reorder(provname,-U5M_COMB), y=U5M_COMB)) #This is where we designate that we want the plot to be ordered
biv_boxplot = base + geom_boxplot(varwidth=T, fill="grey80") + 
      labs(title=box_title, 
         caption=box_caption,
         x="Province",
         y="Under Five Mortality (per thousand)") +
   theme(panel.background = element_rect(fill = "gray98",
                                                    colour = "black"),
                    plot.caption = element_text(hjust = 0), 
                    plot.title = element_text(hjust = 0.5),
                    plot.subtitle = element_text(hjust=0.5),
                    legend.box.just = c("right","center"), 
                    axis.text.x = element_text(size=7, angle = 45, vjust = 1, hjust=1))

biv_boxplot


Multivariate data visualizations


Multivariate #1: Radar plot

Plot of main variables by province


Let’s create some mutivariate plots. The first plot is a radar plot.

First we will need to create a data frame with the relevant information. We want the average of various variables (Health Professional Present at Birth, Health Worker Density, Midwife Density, Medical Doctor Density) by province.

df1=df

df_aggre1 <- aggregate(cbind(birthatend, hwdensity, midwifedensity, mddensity)~provname, data=df1, FUN=mean )

We are choosing to exclude one outlier province. The capital of Mozambique is doing well in all of these different factors

df_aggre1 <- df_aggre1[-c(5), ] #Drop 5th observation

We want our various radar plots to appear in order, so we want to rank the provinces by these factors

# get minimun value by row
df_aggre1$min=apply(df_aggre1[,c(2:5)],1,min)

# turn this min values into a ranking
df_aggre1$min=rank(df_aggre1$min,ties.method ='first' )

# order city by ranking
prov_fact=as.factor(df_aggre1[order(df_aggre1$min),]$provname)

# turn city into ordered factor
df_aggre1$provname=factor(df_aggre1$provname,
                   levels= prov_fact,
                   labels = prov_fact,
                   ordered = T)

Now that it is ordered, we can clean up the data frame before we plot. We don’t need the ranks and we want the column names to be labeled properly so they make sense in the plot.

# delete column with ranks
df_aggre1$min=NULL

colnames(df_aggre1) <- c("provname", "Health Professional Present at Birth", "Health Worker Density", "Midwife Density", "Medical Doctor Density")

Now we can create our radar plot. Remember to give you plot a title and appropriate caption!

radar_title= "Radar plot: Province Main Study Variables"
radar_caption = "Fig.4: Describes how provinces are performing on study variables : Health professional present at birth, health worker density, midwife density, and medical doctor density. 

Source:Demographic Health Surveys"
  
  
base = ggRadar(df_aggre1,aes(group='provname'),legend.position="none") 

radar1 = base + facet_wrap(~provname,nrow =2) +
    labs(title = radar_title,
                    caption = radar_caption) +
              theme(panel.background = element_rect(fill = "gray90"),
                    plot.caption = element_text(hjust = 0, size = 10), 
                    plot.title = element_text(hjust = 0.5, size=14)) 

radar1 


Multivariate #2: Scatter Plot

Scatter plot of factors that may influence u5 mortality


We can also create a multivariate plot to look at what is happening with these various factors over time.

base = ggplot(df1, aes()) + 
  geom_point(aes(x=year, y=hwdensity, color="Health Worker Density"), size = 0.8, alpha=1/3) + 
  geom_smooth(aes(x=year, y=hwdensity), method="loess", se=T, size=1, color="purple") +
  geom_point(aes(x=year, y=midwifedensity, color= "Midwife Density"), size = 0.8, alpha=1/4) + 
  geom_smooth(aes(x=year, y=midwifedensity), method="loess", se=T, size=1, color="green") +
  geom_point(aes(x=year, y=mddensity, color= "Medical Doctor Density"), size = 0.8, alpha=1/5) +
  geom_smooth(aes(x=year, y=mddensity), method="loess", se=T, size=1, color="red")

Create plot of factors that can affect mortality rates

scat_title = "Factors that may Influence the Mortality Rate in Mozambique over time"
scat_caption = "Fig. 5 This figure represents trends in three key factors (health worker density, midwife density, and medical 
doctor density) from 2000 to 2010. 

Source: Demographic Health Surveys"

scat = base + geom_smooth(aes(x=year, y=mddensity), method="loess", se=T, size=1, color="red") +
  labs(title = scat_title,
       caption = scat_caption, 
                    y =" Variable Density (per 1000 live births)", 
                    x = "Years")+
  theme(legend.position = "bottom",
  plot.caption = element_text(hjust = 0), 
                    plot.title = element_text(hjust = 0.5)) +
  scale_x_continuous(breaks = c(2000,2002,2004,2006,2008,2010))  + theme(legend.title=element_blank())

scat 


Multivariate #3: Regression

Effect of health human resources factors on child mortality


Now that we have examined these different variables. Let’s run a regression to see if there is any statistically significant association between these factors and child mortality rates.

We will run 3 different models. In the first model, our left hand side variable is neonatal mortality. In the second model, we use child mortality. In the third model, we use under-five mortality.

After running the regression models, we want to save the information in data frames to plot.

##Model 1
model1=lm(NEONAT_COMB ~ birthatend + midwifedensity + hwdensity + facpop, data=df1)

model1_t = tidy(model1) %>%   
    mutate(model = "Neonatal Mortality") 

##Model 2
model2=lm(INFANT_COMB ~ birthatend + midwifedensity + hwdensity + facpop, data=df1)

model2_t = tidy(model2) %>%  
    mutate(model = "Child Mortality") 

##Model 3
model3=lm(U5M_COMB ~ birthatend + midwifedensity + hwdensity + facpop, data=df1)

model3_t = tidy(model3) %>%  
    mutate(model = "Under-Five Mortality") 

Now we combine the data frames into one.

allModels=rbind(model1_t, model2_t, model3_t)

Now we can create the visualizatios of the regression coefficients with confidence intervals so that we can easily see which factors statistically significantly impact mortality rates.

reg_title = "Effect of Health Human resources and Service 
           Utilization on Child Mortality"
reg_caption = "Fig.6: Show regression coeficientes and 95% CI for birth atendance,
midewife desnsity and total health human resources on child mortality.

Source: DHS"

We also want to change the axis to use titles that make sense to a reader.

allModels$term[allModels$term=="birthatend"] <- "Health Professional Present at Birth"
allModels$term[allModels$term=="midwifedensity"] <- "Midwife Density"
allModels$term[allModels$term=="hwdensity"] <- "Health Worker Density"
allModels$term[allModels$term=="facpop"] <- "Health Facility Density"

Plot the three models side by side and change the colors to black.

regplot = dwplot(allModels, dot_args = list(color="black"), whisker_args = list(color="black")) + facet_wrap(~model) 

We do not need the legend because each pannel is titled. We also want to add a reference line at 0 to quickly see which variables are statistically significantly different than 0.

regplot1 = regplot + theme(legend.position="none") +geom_vline(xintercept = 0, 
               colour = "black", 
               linetype = 4)

Give the plot a title and caption

regplot2 = regplot1 + labs(title = reg_title,
                    caption = reg_caption) +  
          theme(panel.background = element_rect(fill = "gray97",
                                                    colour = "black"),
                    plot.caption = element_text(hjust = 0, size = 8), 
                    plot.title = element_text(hjust = 0.5, size=12)) +
          theme(axis.text.y = element_text(angle = 45, vjust = 0.6)) 

regplot2


MAPS


Map #1

Map of under five mortality by province


To begin the map section we will first install some necessary packages

knitr::opts_chunk$set(warning=FALSE, include=TRUE)

if (!require(utilsIPEA)) install.packages("utilsIPEA", repos = "http://cran.us.r-project.org", dependencies=TRUE, lib="C:/Users/dcoomes/Documents/R/win-library/3.5")
if (!require(rgdal)) install.packages("rgdal", repos = "http://cran.us.r-project.org", dependencies=TRUE, lib="C:/Users/dcoomes/Documents/R/win-library/3.5")
if (!require(raster)) install.packages("raster", repos = "http://cran.us.r-project.org", dependencies=TRUE, lib="C:/Users/dcoomes/Documents/R/win-library/3.5")
if (!require(tmaptools)) install.packages("tmaptools", repos = "http://cran.us.r-project.org", dependencies=TRUE, lib="C:/Users/dcoomes/Documents/R/win-library/3.5")
if (!require(leaflet)) install.packages("leaflet", repos = "http://cran.us.r-project.org", dependencies=TRUE, lib="C:/Users/dcoomes/Documents/R/win-library/3.5")
if (!require(RColorBrewer)) install.packages("RColorBrewer", repos = "http://cran.us.r-project.org", dependencies=TRUE, lib="C:/Users/dcoomes/Documents/R/win-library/3.5")
if (!require(classInt)) install.packages("classInt", repos = "http://cran.us.r-project.org", dependencies=TRUE, lib="C:/Users/dcoomes/Documents/R/win-library/3.5")
if (!require(dplyr)) install.packages("dplyr", repos = "http://cran.us.r-project.org", dependencies=TRUE, lib="C:/Users/dcoomes/Documents/R/win-library/3.5")
library(utilsIPEA)
library(rgdal)
library(raster)
library(tmaptools)
library(leaflet)
library(RColorBrewer)
library(classInt)

Next, we bring in the child mortality data

link="https://github.com/ihsun-uw/Group3_Final_Project/raw/master/child_mortality.dta"
df <- read_dta(url(link))

Then, we bring in our map data. The map data is stored in a zip file in github. To access it we’ll have to download and unzip it.

zip_mozmap_SHP = "https://github.com/ihsun-uw/Group3_Final_Project/raw/master/Mozambique%20shape%20maps.zip"

Downloading the shape files

library(utils)
temp=tempfile()
download.file(zip_mozmap_SHP, temp)
unzip(temp)

Showing which tempfiles (our shape file) is in the directory

(maps=list.files(pattern = 'shp'))
## [1] "MOZ-level_1.shp"

Reading the shape file into R

library(rgdal)
mozzipMap <- readOGR("MOZ-level_1.shp",stringsAsFactors=F) 

The region is coded differently in the shape file data and the mortality data. We are making sure the provinces are coded the same for both data sets.

chw_fp$province_num <- chw_fp$province
chw_fp$province_num[chw_fp$province=="CABO DELGADO"] <- 1
chw_fp$province_num[chw_fp$province=="GAZA"] <- 2
chw_fp$province_num[chw_fp$province=="INHAMBANE"] <- 3
chw_fp$province_num[chw_fp$province=="MANICA"] <- 4
chw_fp$province_num[chw_fp$province=="MAPUTO PROVINCIA"] <- 5
chw_fp$province_num[chw_fp$province=="NAMPULA"] <- 6
chw_fp$province_num[chw_fp$province=="NIASSA"] <- 7
chw_fp$province_num[chw_fp$province=="SOFALA"] <- 8
chw_fp$province_num[chw_fp$province=="TETE"] <- 9
chw_fp$province_num[chw_fp$province=="ZAMBEZIA"] <- 10


mozzipMap$province_num[mozzipMap$ID=="Cabo Delgado"] <- 1
mozzipMap$province_num[mozzipMap$ID=="Gaza"] <- 2
mozzipMap$province_num[mozzipMap$ID=="Inhambane"] <- 3
mozzipMap$province_num[mozzipMap$ID=="Manica"] <- 4
mozzipMap$province_num[mozzipMap$ID=="Maputo"] <- 5
mozzipMap$province_num[mozzipMap$ID=="Nampula"] <- 6
mozzipMap$province_num[mozzipMap$ID=="Niassa"] <- 7
mozzipMap$province_num[mozzipMap$ID=="Sofala"] <- 8
mozzipMap$province_num[mozzipMap$ID=="Tete"] <- 9
mozzipMap$province_num[mozzipMap$ID=="Zambezia"] <- 10
mozzipMap$province_num[mozzipMap$ID=="Maputo (city)"] <- 11


df$province_num[df$provname=="Cabo Delgado"] <- 1
df$province_num[df$provname=="Gaza"] <- 2
df$province_num[df$provname=="Inhambane"] <- 3
df$province_num[df$provname=="Manica"] <- 4
df$province_num[df$provname=="Maputo Provincia"] <- 5
df$province_num[df$provname=="Nampula"] <- 6
df$province_num[df$provname=="Niassa"] <- 7
df$province_num[df$provname=="Sofala"] <- 8
df$province_num[df$provname=="Tete"] <- 9
df$province_num[df$provname=="Zambezia"] <- 10
df$province_num[df$provname=="Maputo Cidade"] <- 11

#creating new data frame with just the year 2010
df_new <- df[111:121,1:26]

We also want to create a new variable that is the change in mortality from 2000 to 2010

df_new$u5_change[df_new$province_num==1] <- 100*(df_new$U5M_COMB[df_new$province_num==1] - df$U5M_COMB[df$province_num==1 & df$year==2000])/df$U5M_COMB[df$province_num==1 & df$year==2000]
df_new$u5_change[df_new$province_num==2] <- 100*(df_new$U5M_COMB[df_new$province_num==2] - df$U5M_COMB[df$province_num==2 & df$year==2000])/df$U5M_COMB[df$province_num==2 & df$year==2000]
df_new$u5_change[df_new$province_num==3] <- 100*(df_new$U5M_COMB[df_new$province_num==3] - df$U5M_COMB[df$province_num==3 & df$year==2000])/df$U5M_COMB[df$province_num==3 & df$year==2000]
df_new$u5_change[df_new$province_num==4] <- 100*(df_new$U5M_COMB[df_new$province_num==4] - df$U5M_COMB[df$province_num==4 & df$year==2000])/df$U5M_COMB[df$province_num==4 & df$year==2000]
df_new$u5_change[df_new$province_num==5] <- 100*(df_new$U5M_COMB[df_new$province_num==5] - df$U5M_COMB[df$province_num==5 & df$year==2000])/df$U5M_COMB[df$province_num==5 & df$year==2000]
df_new$u5_change[df_new$province_num==6] <- 100*(df_new$U5M_COMB[df_new$province_num==6] - df$U5M_COMB[df$province_num==6 & df$year==2000])/df$U5M_COMB[df$province_num==6 & df$year==2000]
df_new$u5_change[df_new$province_num==7] <- 100*(df_new$U5M_COMB[df_new$province_num==7] - df$U5M_COMB[df$province_num==7 & df$year==2000])/df$U5M_COMB[df$province_num==7 & df$year==2000]
df_new$u5_change[df_new$province_num==8] <- 100*(df_new$U5M_COMB[df_new$province_num==8] - df$U5M_COMB[df$province_num==8 & df$year==2000])/df$U5M_COMB[df$province_num==8 & df$year==2000]
df_new$u5_change[df_new$province_num==9] <- 100*(df_new$U5M_COMB[df_new$province_num==9] - df$U5M_COMB[df$province_num==9 & df$year==2000])/df$U5M_COMB[df$province_num==9 & df$year==2000]
df_new$u5_change[df_new$province_num==10] <- 100*(df_new$U5M_COMB[df_new$province_num==10] - df$U5M_COMB[df$province_num==10 & df$year==2000])/df$U5M_COMB[df$province_num==10 & df$year==2000]
df_new$u5_change[df_new$province_num==11] <- 100*(df_new$U5M_COMB[df_new$province_num==11] - df$U5M_COMB[df$province_num==11 & df$year==2000])/df$U5M_COMB[df$province_num==11 & df$year==2000]

Now that we have the data cleaned, we can create the first map of the under 5 mortality rate by province. We start by merging the map with the mortality data frame.

layerContrib_2=merge(mozzipMap,df_new, by.x='province_num', by.y='province_num',all.x=F)

Next, we define which variable in the data frame we want to plot on the map.

varToPlot_2=layerContrib_2$U5M_COMB

Then we set the colors for the map and the interval for the mortality data.

numberOfClasses=5
colorForScale='OrRd'
colors = brewer.pal(numberOfClasses, colorForScale)
intervals <- classIntervals(varToPlot_2, numberOfClasses,
                            style = "kmeans",
                            dataPrecision = 2)
                           
colorPallette <- findColours(intervals, colors)

Finally, we plot the map.

legendText="u5 mortality rate"
shrinkLegend=1
title="Under 5 mortality rate in Mozambique by province (2010)"
subtitle="Figure 7: Shows the under 5 mortality rate by province (including the capital) \nin Mozambique in 2010

Source: DHS"


# first the ORIGINAL to signal missing values:
plot(mozzipMap,col='red',main=title, border="black", lwd=1) 

title(sub=subtitle, adj=0) 

# now the info on u5 mortality
plot(layerContrib_2, col = colorPallette,border=NA,add=T) #add

legend('topright', 
       legend = names(attr(colorPallette, "table")), #values
       fill = attr(colorPallette, "palette"), #colors
       cex = shrinkLegend, #size 
       bty = "n", # no box
       title=legendText)


MAP #2

Map of the change in the under 5 mortality rate by region


Generating one more data set for mapping - change in u5 MR

layerContrib_3=merge(mozzipMap,df_new, by.x='province_num', by.y='province_num',all.x=F)

Defining the variable we want to plot

varToPlot_3=layerContrib_2$u5_change

Defining the color of the map and the intervals

numberOfClasses=5
colorForScale='YlOrRd'
colors = brewer.pal(numberOfClasses, colorForScale)
intervals <- classIntervals(varToPlot_3, numberOfClasses,
                            style = "kmeans",
                            dataPrecision = 0)
                           
colorPallette <- findColours(intervals, colors)

Finally, plotting the second map

legendText="Change in u5 mortality"
shrinkLegend=1
title="Percent change in u5 mortality rate in Mozambique by province"
subtitle="Figure 8: Shows the percent change in under five mortality by province in \nMozambique from 2000-2010.

Source: DHS"

plot(mozzipMap,col='red',main=title, border="black", lwd=1) 

plot(layerContrib_3, col = colorPallette,border=NA,add=T) #add

title(sub=subtitle, adj=0) 

legend('topright', 
       legend = names(attr(colorPallette, "table")), #values
       fill = attr(colorPallette, "palette"), #colors
       cex = shrinkLegend, #size 
       bty = "n", # no box
       title=legendText)